
    Self-Supervised Sketch-to-Image Synthesis

    Imagining a colored, realistic image from an arbitrarily drawn sketch is a human capability that we are eager for machines to mimic. Unlike previous methods that either require sketch-image pairs or use low-quantity detected edges as sketches, we study the exemplar-based sketch-to-image (s2i) synthesis task in a self-supervised learning manner, eliminating the need for paired sketch data. To this end, we first propose an unsupervised method to efficiently synthesize line sketches for general RGB-only datasets. With the synthetic paired data, we then present a self-supervised Auto-Encoder (AE) to decouple the content/style features from sketches and RGB images, and synthesize images that are both content-faithful to the sketches and style-consistent with the RGB images. While prior works employ either a cycle-consistency loss or dedicated attentional modules to enforce content/style fidelity, we show the AE's superior performance with pure self-supervision. To further improve the synthesis quality at high resolution, we also leverage an adversarial network to refine the details of the synthetic images. Extensive experiments at 1024×1024 resolution demonstrate new state-of-the-art performance of the proposed model on the CelebA-HQ and Wiki-Art datasets. Moreover, with the proposed sketch generator, the model shows promising performance on style mixing and style transfer, which require synthesized images to be both style-consistent and semantically meaningful. Our code is available at https://github.com/odegeasslbc/Self-Supervised-Sketch-to-Image-Synthesis-PyTorch, and an online demo of our model is available at https://create.playform.io/my-projects?mode=sketch. Comment: AAAI-202

    BestConfig: Tapping the Performance Potential of Systems via Automatic Configuration Tuning

    An ever-increasing number of configuration parameters are exposed to system users. Yet many users apply a single configuration setting across different workloads, leaving the performance potential of their systems untapped. A good configuration setting can greatly improve the performance of a deployed system under certain workloads. But with tens or hundreds of parameters, deciding which configuration setting leads to the best performance becomes a highly costly task. Such a task requires strong expertise in both the system and the application, which users commonly lack. To help users tap the performance potential of systems, we present BestConfig, a system for automatically finding a best configuration setting within a resource limit for a deployed system under a given application workload. BestConfig is designed with an extensible architecture to automate configuration tuning for general systems. To tune system configurations within a resource limit, we propose the divide-and-diverge sampling method and the recursive bound-and-search algorithm. BestConfig can improve the throughput of Tomcat by 75%, that of Cassandra by 63%, and that of MySQL by 430%, and can reduce the running time of a Hive join job by about 50% and that of a Spark join job by about 80%, solely through configuration adjustment.
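The percentages quoted in the abstract above mix two kinds of measurement (throughput gains and running-time reductions), so they are not directly comparable. A minimal sketch of converting both to a common speedup factor follows. The helper names `throughput_factor` and `runtime_speedup` are hypothetical and used only for illustration; only the percentages come from the abstract, and the Hive and Spark figures are stated there as approximate.

```python
def throughput_factor(pct_gain):
    """Speedup factor implied by a throughput gain of pct_gain percent."""
    return (100 + pct_gain) / 100


def runtime_speedup(pct_reduction):
    """Speedup factor implied by a running-time reduction of pct_reduction percent."""
    return 100 / (100 - pct_reduction)


# Percentages reported in the abstract (Hive and Spark are "about" values).
throughput_gains = {"Tomcat": 75, "Cassandra": 63, "MySQL": 430}
runtime_reductions = {"Hive join": 50, "Spark join": 80}

for name, pct in throughput_gains.items():
    print(f"{name}: {throughput_factor(pct):.2f}x throughput")
for name, pct in runtime_reductions.items():
    print(f"{name}: about {runtime_speedup(pct):.1f}x faster")
```

Read this way, an 80% reduction in running time corresponds to roughly a fivefold speedup, which is close to the 5.3x implied by the 430% throughput gain.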

    Diffusion Guided Domain Adaptation of Image Generators

    Can a text-to-image diffusion model be used as a training objective for adapting a GAN generator to another domain? In this paper, we show that classifier-free guidance can be leveraged as a critic, enabling generators to distill knowledge from large-scale text-to-image diffusion models. Generators can be efficiently shifted into new domains indicated by text prompts without access to ground-truth samples from the target domains. We demonstrate the effectiveness and controllability of our method through extensive experiments. Although not trained to minimize CLIP loss, our model achieves equally high CLIP scores and significantly lower FID than prior work on short prompts, and outperforms the baseline qualitatively and quantitatively on long and complicated prompts. To the best of our knowledge, the proposed method is the first attempt at incorporating large-scale pre-trained diffusion models and distillation sampling for text-driven image generator domain adaptation, and it delivers a quality that was previously out of reach. Moreover, we extend our work to 3D-aware style-based generators and DreamBooth guidance. Comment: Project website: https://styleganfusion.github.io

    TIME: Text and Image Mutual-Translation Adversarial Networks

    Focusing on text-to-image (T2I) generation, we propose Text and Image Mutual-Translation Adversarial Networks (TIME), a lightweight but effective model that jointly learns a T2I generator G and an image captioning discriminator D under the Generative Adversarial Network framework. While previous methods tackle the T2I problem as a uni-directional task and use pre-trained language models to enforce image-text consistency, TIME requires neither extra modules nor pre-training. We show that the performance of G can be boosted substantially by training it jointly with D as a language model. Specifically, we adopt Transformers to model the cross-modal connections between the image features and word embeddings, and design an annealing conditional hinge loss that dynamically balances the adversarial learning. In our experiments, TIME achieves state-of-the-art (SOTA) performance on the CUB and MS-COCO datasets (Inception Score of 4.91 and Fréchet Inception Distance of 14.3 on CUB), and shows promising performance on MS-COCO on image captioning and downstream vision-language tasks. Comment: AAAI-202

    How Do Price and Quantity Promotions Affect Hedonic Purchases? An ERPs Study

    Because hedonic products are not necessary for basic well-being, consumers need justifications for the pleasure of consuming them. However, different justifications, such as price and quantity promotions (PP and QP), differ in how strongly they promote hedonic purchases; the difference between the two is that QP requires purchasing additional units to get the same discount as PP. In the present study, event-related potentials (ERPs) were applied to reveal the timing of brain activity and to further understand how promotion information on hedonic products, consisting of promotion type (PP and QP) and discount depth (deep and shallow discounts, DD and SD), is processed. Behaviorally, consumers were more willing to purchase items in the PP and DD conditions than in the QP and SD conditions, respectively, and spent more time making final purchase decisions in the QP and DD condition or the PP and SD condition than in the PP and DD condition. Neurophysiologically, DD automatically recruited more attentional resources than SD and led to a higher P2 amplitude. The QP and DD condition or the PP and SD condition evoked a larger N2 amplitude and enhanced perceptual conflict compared to the PP and DD condition. During the late stage, PP and DD elicited a more positive LPP amplitude than QP and SD, respectively, indicating that people have stronger purchase intention and positive affect in PP and DD contexts. These findings provide evidence for the differential influences of PP and QP and for what ultimately makes consumers buy hedonic products or not.

    Improving Negative-Prompt Inversion via Proximal Guidance

    DDIM inversion has revealed the remarkable potential of real image editing within diffusion-based methods. However, the accuracy of DDIM reconstruction degrades as larger classifier-free guidance (CFG) scales are used for enhanced editing. Null-text inversion (NTI) optimizes null embeddings to align the reconstruction and inversion trajectories with larger CFG scales, enabling real image editing with cross-attention control. Negative-prompt inversion (NPI) further offers a training-free closed-form solution of NTI. However, it may introduce artifacts and is still constrained by DDIM reconstruction quality. To overcome these limitations, we propose Proximal Negative-Prompt Inversion (ProxNPI), extending the concepts of NTI and NPI. We enhance NPI with a regularization term and reconstruction guidance, which reduces artifacts while capitalizing on its training-free nature. Our method provides an efficient and straightforward approach, effectively addressing real image editing tasks with minimal computational overhead. Comment: Code at https://github.com/phymhan/prompt-to-promp
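For readers unfamiliar with the CFG scale mentioned in the abstract above, the standard classifier-free guidance rule extrapolates from the unconditional noise prediction toward the conditional one. The sketch below shows only that general, widely used formula on toy vectors; it is not the ProxNPI method or code from the linked repository, and the function name `cfg_combine` and the example values are hypothetical.

```python
def cfg_combine(eps_uncond, eps_cond, scale):
    """Standard classifier-free guidance, applied element-wise:

        eps = eps_uncond + scale * (eps_cond - eps_uncond)

    scale = 0 gives the unconditional prediction, scale = 1 the conditional
    one, and scale > 1 extrapolates beyond the conditional prediction.
    """
    return [u + scale * (c - u) for u, c in zip(eps_uncond, eps_cond)]


# Toy noise predictions (made-up values, standing in for model outputs).
eps_uncond = [0.0, 1.0, -1.0]
eps_cond = [1.0, 1.0, 0.0]

for scale in (0.0, 1.0, 7.5):
    print(scale, cfg_combine(eps_uncond, eps_cond, scale))
```

With a scale above 1, the guided prediction departs from both model outputs, which gives some intuition for why a trajectory obtained by inversion at one scale may no longer reconstruct the image when sampling at a larger one.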

    Effect of Temperature on Electromagnetic Performance of Active Phased Array Antenna

    Active phased array antennas (APAAs) can suffer from the effects of harsh thermal environments, which are caused by the large quantity of power generated by densely packed T/R modules and by external thermal impacts. The situation may be worse in the case of limited room and severe thermal loads, due to heat radiation and a low temperature sink. As a result, the temperature field of the antenna can change. Since large numbers of temperature-sensitive electronic components exist in T/R modules, the excitation current output can be significantly affected and the electromagnetic performance of APAAs can be seriously degraded. However, due to a lack of quantitative analysis, it is difficult to directly estimate the effect of temperature on the electromagnetic performance of APAAs. Therefore, this study investigated the electromagnetic performance of APAAs as affected by two key factors (the uniformly distributed temperature field and the temperature gradient field) for different antenna shapes and sizes, to provide theoretical guidance for their thermal design.

    Strongly Secure Authenticated Key Exchange from Supersingular Isogenies

    This paper addresses the open problem raised by Galbraith in 2018, namely to find new techniques to design and prove the security of supersingular isogeny-based authenticated key exchange (AKE) protocols against the widest possible range of adversarial attacks. Concretely, we present two AKEs based on a double-key PKE in the supersingular isogeny setting that are secure in the sense of CK$^+$, one of the strongest security models for AKE. Our contributions are summarised as follows. Firstly, we propose a strong $\textsf{OW-CPA}$ secure PKE, $\mathsf{2PKE_{sidh}}$, based on the SI-DDH assumption. By applying a modified Fujisaki-Okamoto transformation, we obtain a $[\textsf{OW-CCA}, \textsf{OW-CPA}]$ secure KEM, $\mathsf{2KEM_{sidh}}$. Secondly, we propose a two-pass AKE, $\textsf{SIAKE}_2$, based on the SI-DDH assumption, using $\mathsf{2KEM_{sidh}}$ as a building block. Thirdly, we present a modified version of $\mathsf{2KEM_{sidh}}$ that is secure against leakage under the 1-Oracle SI-DH assumption. Using the modified $\mathsf{2KEM_{sidh}}$ as a building block, we then propose a three-pass AKE, $\textsf{SIAKE}_3$, based on the 1-Oracle SI-DH assumption. Finally, we prove that both $\textsf{SIAKE}_2$ and $\textsf{SIAKE}_3$ are CK$^+$ secure in the random oracle model and support arbitrary registration. We also provide an implementation to illustrate the efficiency of our schemes. Our schemes compare favourably against existing isogeny-based AKEs. To the best of our knowledge, they are the first of their kind to offer security against arbitrary registration, wPFS, KCI and MEX simultaneously. Regarding efficiency, our schemes outperform existing schemes in terms of bandwidth as well as CPU cycle count.